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FINAL  TECHNICAL  REPORT 


IMPROVED  NOISE  SUPPRESSION  FOR  MULT I BAND  IMAGERY 


1.0  INTRODUCTION 

This  research  has  developed  and  tested  an  approach  to  improve¬ 
ment  of  noise  degraded  multiband  imagery  based  on  a  class  dependent 
Karhunen-Loeve  (K-L)  transformation  and  transform  domain  spatial 
filtering.  The  technique  is  attractive  because  of  its  tendency  to 
preserve  the  high  spatial  frequency  information  (e.g.,  edges)  in  the 
picture  as  compared  to  conventional  minimum  mean  square  error  linear 
filtering  of  the  original  bands.  By  multi-band  we  mean  a  set  of 
images  of  the  same  scene  as  recorded  in  different  optical  wavelengths, 
at  different  times,  or  a  combination  of  both.  For  example,  an  ordi¬ 
nary  color  photograph  is  composed  of  three  bands — red,  green,  and 
blue.  Multispectral  imagery  is  another  example  for  which  the  same 
scene  is  imaged  in  several  bands  ranging  from  the  visible  to  the 
infrared.  Color  video  (e.g.,  television)  is  an  example  of  the  combi¬ 
nation  of  spectral  and  temporal  bands. 

Multi-band  images  (especially  color  and  multispectral)  character¬ 
istically  exhibit  significant  class  structure,  and  within  a  particu¬ 
lar  class  shows  high  spectral  correlations.  The  filtering  technique 
developed  during  this  research  exploits  both  the  class  structure  and 
the  intra  class  correlation  to  achive  noise  suppression. 

2.0  DEFINITIONS 

The  multi-band  image  is  viewed  as  a  vector  random  process 
P(x,  y)  where 


P(x,  y)  - 


P1(x,  y) 

p2(x»  y > 


P«(x,  y) 
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* 


* 


* 


t 


* 
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and  Pi(x,  y)  is  the  amplitude  of  the  ith  image  having  spatial 
variables  x  and  y.  We  define  the  N  x  N  spectral  covariance  matrix 
as 


C<x,  y)  =  E{P(x,  y)  PC(x,  y)}  (2) 

where  E  is  the  expectation,  t  indicates  transpose,  and 

E{P(x,  y)}  =  0  is  assumed.  We  also  assume  spatial  stationarity  so 

that 


£(x,  y)  =  C  . 


(3) 


The  general  stationary  covariance  matrix  for  the  process 
is 


K(t  .  =  E{P(x,  y)  PC(x  +  T  ,  y  +  T  )}  ,  (4) 

^  A  y  x  y 

which  includes  the  spatial  covariance.  The  (i,  j)  the  element  of 
K(rx,  Ty)  is 

ki,j(Tx’  Ty*  =  E*Pi(x»  y)  Pj<x  +  Tx’  y  +  Ty^  ' 

To  a  good  approximation. 


k.  ,  t  )  =  c,  .  R(t  ,  T  ) 

i,j  x’  y'  i,j  x’  y' 


where  c^  j  is  the  (i,  j)th  element  of  C  and 

R<V  O  -  E{Pi(x,  y)  Pt(x  +  Tx>  y  +  T  )}/o2  i  «  1,  2, 


(6) 


N  (7) 


is  the  normalized  scaler  autocovariance  function  for  all  elements 

of  P(x,  y).  That  is,  each  band  is  assumed  to  have  the  same  spatial 

2 

autocovariance  function  R(t  ,  T  )  and  variance  o  .  This  is  verified 

x  y 

experimentally  as  shown  in  Figure  1  where  we  show  the  (azimuthally 
averaged)  autocovariance  function  for  the  red,  green,  and  blue  bands 


00*59 


of  a  "typical"  test  image. 
From  (4)  and  (6), 
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K(t  ,  T  )  =  C  R(t  ,  T  )  .  (8) 

-  x  y  ~  x  y 

With  the  above  definitions  and  assumptions  we  next  derive  the 
general  covariance  matrix  for  the  principle  component  vector  a(x,  y) 

a(x,  y)  =  T  P(x,  y)  (9) 

where  T  is  the  N  x  N  matrix  of  eigenvectors  of  C.  Thus, 

~a(V  V  “  E{“(x’  y)  ac<x  +  Tx*  y  +  Ty)} 

=  E{T  P(x,  y)  Pfc(x  +  t  ,  y  +  T  )Tt} 

~  x  y 

=  T  K(t  ,  T  )TC  .  CIO) 

~  ~  x  y 

But  K(x  ,  T  )  “  C  R(t  ,  t  )  •  Thus  (10)  becomes 
~  x  y  -  x  y 

K  (T  ,  T  )  =  CT  C  TC)  R(T  ,  T  ) 

-a  x  y  -  ~  x’  y 


R(t  ,  x  )\ 
x’  y  ~ 


(ID 


) 

-5- 


►  £  since  C  is  diagonalized  by  the  orthogonal  similarity  transformation 

T  C  Tt  .  The  A^ ,  i  =  1,  2,  ...,  N  arc  the  eigenvalues  of  C  ordered 

such  that  A,  >  A„  >  ...  >  A.,. 

1  —  l  —  —  N 

Equation  (11)  is  a  key  result.  It  states  that  the  principal 
|?  component  images  arc  orthogonal,  that  their  variances  are  destributed 

as  the  A^,  and  that  their  individual  spatial  autocovariance  functions 
are 

Ri  <V  V  “  E*ai(x»  y)  «,(x  +  Tv*  y  +  V} 

a  y  y 

-  A.  R(t  ,  T  )  i  =  1,  2 . N.  (12) 

a  x  y 

That  is,  each  R.  (t  ,  r  )  is  a  scaled  (by 
o  X  y 

original  spatial  autocovariance  function, 
is  developed  next. 


A^)  replica  of  the 
The  impact  of  this  result 


o 


-6- 


3.0  DERIVATION  OF  THE  MATRIX  WIEN'KR-iiOPF  EQUATION 

lo  determine  the  filter  generating  the  minimum  variance  linear 
estimate  of  a  vector  picture  process  received  in  the  presence  of 
additive  white  noise  the  following  problem  formulation  is  needed. 
Suppose  £(>0  is  the  n-element  vector  picture  process,  with  x  =  (x,  y) 
the  generic  point  on  the  picture  surface,  and  n(x)  is  the  additive 
white  observation  noise.  Then,  the  problem  is  to  find  a  matrix 
spatial  impulse  response  (x,  y)  such  that 


all  H(x,  y) 


E{||p(x)  -  f  ll(x,  £)[P(y)  +  n(y)]d;>rj|  } 

v)  J 


=  E{||p(x)  -  J  H^°\x>  y >  t£(v)  +  n(y)]dv||  } 


for  all  jc  in  the  2-dimensional  area  A  defined  by  the  picture.  Here 
"all  H(x,  £)"  means  all  square-integrable  H(x,  y) ,  i.c.,  all  matrices 
H(x,  ^“[{h^Gc,  y)}™  such  that 


// 


H(x,  £)  (I  dxd^ 


A  A  . 


fft 


A  A  i, j=l  ^ 


h  (£,  y)  dxd£ 


<  +  00 


For  convenience,  P(jc)  +  n(x) ,  the  received  picture  process,  will  be  denoted 
by  _r  (x) : 

r(x)  ■  P(x)  +  n(x)  . 


Here 


E(r  (x)rC(y) ]  =  E(P (x)Pt (y) ] 


+  ElnWn^)!)]  =  E[P(x)PC(x)3 


+  2Nq<>(x-x)I 

{i  =  {6  }  and  for  any  v,  vC  is  the  transpose  of  v}. 


(1A) 


,(0), 


Now  it  will  be  verified  that  if  H'  ' (x ,  y)  satisfies  a  matrix  analogue 
of  the  scalar  Wiener-Hopf  integral  equation  (x,  y)  does  indeed  provide 
the  minimization  of  mean  square  error  described  by  (1).  It  will  be  assumed 


that 


E{[P(x)  -  yV0)(x>  yJrfyOdy.lr^z)}  =  0,  all  z  in  A  (15) 


Note  that  (15)  is  equivalent  to 


R£,£(— ’  & 


H(0)(x,  x)Rr>r(z*  £>dZ  =  0 


(16) 


where 


R£  r (2i*  £)  “  E[P(x)rt(£)] 


and 


R  (y_.  z)  -  E[r(y)rC(z)  ] . 

£>£. 

Thus,  (15)  is,  in  the  form  (16),  a  matrix  Wiener-Hopf  equation. 

Write  the  arbitrary  square  integrable  H(x,  £)  as  a  perturbation 
of  H^°\x,  x); 


t 
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H(x,  x)  =  If  '(x,  x)  =  A(x,  x> 


Then,  the  following  sequence  of  equalities  holds,  where  the  operator  "tr’ 
is  defined  by  the  equation 


tr({m. .})  =  /  m . . ,  all  matrices  {m..} 

ij  ii  ij 


E{  ||  P(x)  -  f  H(x,  y)r  (^)dy  ||  } 

•'a 

=  E(tr[(P(x)  - J ' H^°\x,  y)r(y)djO(P(x)  "  J "  H(0)(x,  Z)Z(z)dZ)C]} 


=  E{tr[(P(x)  -  /  H^(x,  X^lWdZ.  ~  f  A(x,  Z)£(z)dZ) 

A  ‘'A 

x  (IW  -  f  H^(x,  z)£^Z)dZ  "  /  ^.(x,  Z^ZCZ^Z^H 
•'a  "a 

■  E{tr[(P(x)  -  y  H^°\x>  Z>Z^Z) dZ>  (£  (x)  ~f  Z^Z^Z^ 

A  *^A 

-  (£<*>  -  /  h(0)(x,  z)z(z)dz>( /  *(*.  z)z(z>dz)t 


+  E  { t  r  { — 


^)r(^-)dx)  (P(x) 


Cx,  x)i(y)dv)t 


+ 


Mx.  ^)r(^)dy) ( 


2)r(£)dx)n]} 


(17) 


Since  the  "tr"  operator  is  linear  and  is  invariant  under  matrix  trans¬ 
position,  (17)  becomes 


E(||  P(x) 


*  E{tr[(P(x)  - 


(x,  y)r(^)d^)  (P(x  - 


(x. 


^)r(^)  d^)*]} 


-  2E(tr[(P(x) 


(x»  z >i(z)dz)  ( f  A 

•'a 


(x,  z)£(z^dz>t^ 


E{tr[( y  A  (x,  y)_r  (x) dZ) ( / ^  (x,  Z^lXjL^Z)^ 


(18) 


But,  first. 


E{tr((P(x) 


(x,  Z)r(Z)dZ)  (P(x) 


(x,  2)l(Z^dZ)t^ 


E{ |1  P(x)  -  f  H(0)(x.  Zil^dzli2} 

Jk 


(19) 


Moreover, 


E{tr[(P(x)  - 


(x,  £>I_(z)dZ)(  f  A(x. 

,'h 


y)r_ (yjdy)*1] } 


*=  E{tr[ [  /*l(P(x)  -  y)  ir  (y) di^£C (?)  1 A* (x .  l)d£.)^ 

A  A 

=  tr  { £ 2[  (P(x)  “  (~*  C^-^— t^-’  -^d— ^ 


Finally,  using  (14), 


E(tr[( ^ A(x,  ^)_r(x)dZ>( ^ Mx,  y)r_(y)dy) t] > 


»  E{tr[(  /*  A(x,  y)r (y)dy)  (  J  Z^dX^l^ 

4  ”a 

=  E{tr[  j"  Jh(x,  (z)^ (x,  £) dydzj ) 

A  A 

■  tr [  f  /a(x,  y) E [£.(z)l.t 1  A* (* »  £)dZd£l 
•'a  /a 

-  tr( y* /  A(x,  ^)E[P(^)£t U)  ]At(x,  z)dydz) 

A  *^A 


» 


(20) 
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+  tr 


'//  A(x,  l)  z)  d  vd  £ 

A  A 


=  E{tr[(  J A(x,  x)P(x)d)r)( £ '  Mx.  Z)P^dX)Cn 


+  tr 


{(Area (a)) A(x,  jOA^x,  x)d£] 


=  E{||  A(x,  X)P(£)dy|f}  t  Area  (A)  J"  ||  A(x,  %)  j| 


(21) 


where 


3 

II  x)  ||2  -  23  aij  ♦ 

i»j=l 


Hence,  combining  (18),  (19),  (20)  and  (21),  there  results 


E(||P(x)  -  f  H(x,  X>l(l)dJllf>  *  E(||P(x)  -  /H(0)(x.  JL)£(x)dlH2I 

Jk  Jk 

+  tr{  J  E[(P(x)  -  (x ,  x)dx)lC  (z)  <2L»  z)dz) 


+  E(  1|  J A(x,  x)P(x)d^||2)  +  Area(A)  J'  ||  A(x,  ■£)  ||2dx  (22) 


■A  - 


-12- 


Then,  invoking  the  Wiener-Hopf  equation  (15),  there  results 


E{||P(x)  -  f  H(x,  £)r(£)dZ||2}  “  F-{||  P(x)  -  ^ H(0)  (x,  v)r(y)dX||2} 
•'a  A 

+  E{||  J  A(x,  £)P(y)dy||2}  +  Area  (A) J"\ ||  A(x,  y)  ||2d£  (23) 


Thus,  whenever  for  some  i^  and  j 


/ 


A?  4  (x.  z>dz  >  0, 

0J0 


it  follows  from  (22)  that 


E{||  P(x)  -  /  H(x,  x>£(z)dzl|2>  >  E(||P(x)  -  f^\,  Z>£(z)dzl|2> 


But,  when  for  all  i  and  j 


/ 


Z>dZ  ■  °» 


i . e . ,  when 


e(  || 


/II  A(x.  Z>  II2  "  °» 

*A 

P(x)  -  f  H(x ,  x)£<Z>dzll  >  “  Ef  ||  PCx)  -  J  H(0)(x,  jf)r<jr)  ll*> 

*  a 


by  (22). 


-13- 


Hencc,  not  only  does  H^(x,  minimize  E { ||  P(x)  -  \\(x,  z)j"(z)dv  |f} 

A 

but  any  minimizing  H(x,  z.)  has  the  property  that 


J  J II  H(x,  y)  -  H^°\x,  y)  |f dxdz 


A  A 


//' 

A  A 


A(x,  z>  ||  dxdz 


-  o, 

i.e.,  H(x,  y)  differs  from  H^(x,  z)  by  a  null  function. 

Thus  a  solution  of  the  Wiener-Hopf  equation  (15)  (or  (16))  not  only 
minimizes  the  mean  square  error  but  it  is  unique  in  that  respect  to 
within  a  null  function.  This  means,  in  addition,  that  if  the  Wiener-Hopf 
equation  has  one  square- integrable  solution  it  is  unique  to  within  a 
null  function. 

To  see  that  the  existence  of  a  solution  of  the  Wiener-Hopf  equation 
is  a  necessary,  as  well  as  sufficient,  condition  for  the  existence  of 
a  minimizing  choice  of  H(x,  %)  one  must  simply  analyze  equation  (22) . 

This  equation  is  true  independent  of  whether  or  not  H^(x,  y)  is  a 
solution  of  the  Wiener-Hopf  equation.  But  if  (x,  y)  is  not  a  solu¬ 
tion  of  the  Wiener-Hopf  then  set 


Mx,  z)  *  -£M(x,  z) 


where  £  is  an  arbitrary  positive  number  and 


(24) 


M(x,  z)  -  E[(P(x)  -  /  H(0)(x,  z)dZ)lt(£)] 

A 


(25) 
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In  this  case. 


tr{  J  E[(P(x)  -  f  H^(x,  jOdyJr^Cz)]  AC(^,  ^)dz 

A  “'A 


-etr{ J"  M(x,  z)MC(x,  z)dz) 


=  -e  J "  tr(M(x,  £)MC(x,  £>}d£ 


m  J"  li  £)  H2d£ 


As  a  result  of  (26) ,  (22)  becomes 


S{||  P(x)  -  /  H(x,  Z>£<Z)dzlP>  -  E(||P(x)  -  y  H(0)(x,  z)£(z)dzl|2> 


||  M(x,  z)  ||2dz  +  E{||  /  (-e)M(x,  £)P(z)d£ll2 


+  Area (A)  /  ||  (-e)M(x,  II2 djr 


P(x)  -  J  H^°*(x,  ^r^djrll2}  -  e  J  II  M(x,  z)||2dz 


hi 


hi 


'{E{||  j*  M(x,  £)P(^)d^||2}  +  Area  (A)  J*  ||  M(x,  v) \f  d%)  (27) 


Assuming  that  ttK  ' (x,  is  not  a  solution  of  the  Wiener-Hopf  equation 
and  remembering  the  definition  of  M(>?,  y) ,  (25)  makes  it  clear  that 


-£/ 


M(x,  z)  |f  dz  <  0 


Thus,  for  e  sufficiently  small  (27)  shows  that 


E(||P(x)  -  f  H(x,  z)r(X)dX|p}  <  E(||P(x)  -  J  H(0)  (x,  £)r(Z)dZ||2} 


where,  by  (24)  and  (25), 


H(x,  z)  =  ' (x ,  z)  +  &(x,  z) 


H^(x,  z)  -  EE[(P(x)  -  J  H^(x,  yJrCy^dyJr^z)  ] 


Hence,  it  has  been  shown  that  if  Hv  ;  (x,  y_)  is  not  a  solution  to  the 
Wiener-Hopf  equation  tt^(x,  y)  does  not  minimize  the  mean-square 
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4.0  SOLUTION  UNDER  THE  ASSUMPTION  OF  STATIONAKITY  AND  NON-CAUSALITY 

If  we  allow  the  matrix  filter  to  be  non-cnusal  and  further  assume 
spatial  stationarity ,  we  find  the  result, 


II  (u>  .  to  )  =  TC  G  (to  ,  w  )T 
A  y  **  A  y  - 


(28) 


where 


r  X1S(V  V 

Xls(“x'  “y1  +  °n 


G(ux,  u>y)  = 


XHS<V  V 
V'V  “y>  +  ° 


N 


(29) 


and  S(to  ,  to  )  is  the  spatial  power  spectral  density  of  the  original 
X  ^2 

bands,  and  o  is  the  noise  variance.  This  matrix  filter  consists 
2 

of  N  filters  given  by 


N 


Hi  j(Wx»  “y)  *  ^  ek/i  ek,j 


k=l 


Xn  S(V  V 
Xn  S(V  V  +  °n> 


(30) 


where  e^  ^  is  the  k,i-th  element  of  T  (i-th  element  of  the  k-th 
eigenvector  of  C) . 

2 

Implementation  of  the  N  filters  requires 
°H  =  N2Ng(l  +  4  log2Ns) 

operations  (multiplications  and  additions),  where  N<,  is  the  number 
of  samples  per  line  in  each  of  the  original  (square)  images.  If  we 
implement  the  filters  using  the  G  matrix  preceded  and  proceeded  by 
the  K-L  transform  as  indicated  in  (28),  then  the  number  of  operations 
is 


°K-L  "  NNsl2N  +  +  4  1o82NS)]  ‘ 


Clearly, 


In  fact. 


°K-L 


<  1  for  N  >  2, 


NS  - 


Lim 

M  ->oo 

S 


K-L 


1 

N 


and  the  filter  implementation  in  (28)  results  in  an  approximate  1/N 
reduction  in  the  number  of  operations  for  typical  pictures  (N  >  256). 

v)  “ 

For  example,  with  N_  «  256  and  N  *  3  (color  data),  0_/0„  =  1/2.54 
(7.68  x  10°  ops  versus  19.5  x  10°  ops). 

5.0  CLASS-DEPENDENT  K-L  TRANSFORMATION 

The  presence  of  linear  class  structure  within  multiband  data 
suggests  the  use  of  K-L  transforms  matched  to  the  covariance  matrices 
of  each  class.  In  so  doing  we  achieve  a  piece  wise  linear  approxi¬ 
mation  to  a  non-linear  transformation  of  the  multiband  data,  and 
approximate  the  "intrinsic  dimensionality"  of  the  multiband  data. 

The  result  is  a  further  compression  of  data  variance  into  a  smaller 
number  of  principal  components  compared  to  the  global  K-L  based  on 
a  single  covariance  matrix  computed  over  the  entire  picture. 

Our  approach  to  classification  is  based  on  a  simple  unsupervised 
clustering  algorithm  in  which  an  initial  (random)  guess  of  both  the 
number  of  classes  and  their  mean  vectors  are  iterated  until  the 
mean  vectors  remain  constant  and  the  inter  class  distances  (Bhatta- 
charyya  distance)  satisfy  a  threshold.  Upon  convergence  of  the 
algorithm  we  have  the  number  of  class  and  their  covariance  matrices. 
This  information  is  used  to  compute  the  required  set  of  K-L  trans¬ 
forms,  and  to  implement  a  conventional  liklehood-ratio  hypothesis 
test  for  each  data  vector  comprising  the  multiband  image. 

Once  the  data  is  classified  we  then  perform  the  class  dependent 
K-L  transformation.  The  MSE-optimum  spatial  filters  are  applied 
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to  the  principal  component  images,  followed  by  inverse  transformation. 

The  effectiveness  of  the  above  procedure  has  been  verified 
using  a  color  test  image  (3  bands:  red,  green,  and  blue)  of  an 
F-16  aircraft  obtained  from  the  USC  Image  Processing  Institute. 

In  Figure  2  we  show  histograms  (estimates  of  the  probability 
density  function)  of  a)  the  original  data,  b)  global  principal  comp¬ 
onents  and  c)  the  class  dependent  principal  components.  Note  the 
significant  reduction  in  variance  in  c). 

For  this  particular  date  set  the  clustering  algorithm  generated 
five  classes.  These  classes  are  depected  in  Figure  3.  The  unique¬ 
ness  of  each  class  is  somewhat  evident  in  Figures  (4)  and  (5)  where 
we  show  a  two  dimensional  projection  of  the  three  dimensional  joint 
density  function  for  the  original  bands.  Note  the  varying  directions 
of  maximum  variance  for  each  class,  and  therefore,  the  appropriateness 
of  separate  K-L  transform  for  each  class. 


b)  Global  principal  components 


Intensity 


c)  Class-dependent  principal  components 


PC-3 


Fig.  2— Comparison  of  density  functions 


0 


Green  band 


Fig.  4— Contour  plot  of  red-green  joint  density 


Fig.  5— Three  dimensional  view  of  Fig.  (4) 


